Probabilistic Models for Common Spatial Patterns: Parameter-Expanded EM and Variational Bayes
Abstract
Common spatial patterns (CSP) is a popular feature extraction method for discriminating between positive and negative classes in electroencephalography (EEG) data. Two probabilistic models for CSP were recently developed: probabilistic CSP (PCSP), which is trained by expectation maximization (EM), and variational Bayesian CSP (VBCSP), which is learned by variational approximation. Parameter expansion methods use auxiliary parameters to speed up the convergence of EM or the deterministic approximation of the target distribution in variational inference. In this paper, we describe the development of parameter-expanded algorithms for PCSP and VBCSP, leading to PCSP-PX and VBCSP-PX, whose convergence speed-up and high performance are emphasized. The convergence speed-up in PCSP-PX and VBCSP-PX is a direct consequence of parameter expansion methods; the contribution of this study is the performance improvement in the case of CSP, which is a novel development. Numerical experiments on the BCI competition datasets (competition III, dataset IVa and competition IV, dataset 2a) demonstrate the high performance and fast convergence of PCSP-PX and VBCSP-PX, as compared to PCSP and VBCSP.

Introduction

Electroencephalography (EEG) is the recording of electrical potentials at multiple sensors placed on the scalp, leading to multivariate time series data reflecting brain activities. EEG classification is a crucial part of non-invasive brain computer interface (BCI) systems, enabling computers to translate a subject's intention or mind into control signals for a device such as a computer, wheelchair, or neuroprosthesis (Wolpaw et al. 2002; Ebrahimi, Vesin, and Garcia 2003; Cichocki et al. 2008).

Common spatial patterns (CSP) is a widely-used discriminative EEG feature extraction method (Blankertz et al. 2008; Koles 1991; Müller-Gerking, Pfurtscheller, and Flyvbjerg 1999; Kang, Nam, and Choi 2009), also known as the Fukunaga-Koontz transform (Fukunaga and Koontz 1970), in which we seek a discriminative subspace such that the variance for one class is maximized while the variance for the other class is minimized. CSP was recently cast into a probabilistic framework (Wu et al. 2009), where a linear Gaussian model for each of the positive/negative classes was considered and the maximum likelihood estimate of the basis matrix shared across the two class models was shown to yield the same solution as CSP. Bayesian models were also proposed for CSP (Wu et al. 2010; Kang and Choi 2011), where posterior distributions over variables of interest are estimated by variational approximation.

We revisit two probabilistic models for CSP. One is probabilistic CSP (PCSP) (Wu et al. 2009), in which the maximum likelihood estimate is determined by expectation maximization (EM); the other is variational Bayesian CSP (VBCSP) (Kang and Choi 2011), in which the posterior distributions over the variables in the model are computed by variational inference in the framework of Bayesian multi-task learning (Heskes 2000). EM and variational inference, while successful, often suffer from slow convergence to the solution. Parameter eXpanded-EM (PX-EM) (Liu, Rubin, and Wu 1998) is a method for accelerating EM using an over-parameterization of the model. The underlying idea in PX-EM is to use a covariance adjustment to correct the analysis of the M step, thereby exploiting extra information captured in the imputed complete data.
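As a concrete point of reference for the classical method that these probabilistic models generalize, the following is a minimal sketch of standard CSP filter computation via a generalized eigenvalue problem. It assumes per-class trials are given as NumPy arrays; the function name, the trace-normalized averaging of trial covariances, and the filter count are illustrative choices, not the authors' implementation:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_pos, trials_neg, n_filters=6):
    """Classical CSP via the generalized eigenvalue problem
    Sigma_pos w = lambda (Sigma_pos + Sigma_neg) w.

    trials_pos, trials_neg: arrays of shape (n_trials, n_channels, n_samples).
    Returns W of shape (n_channels, n_filters); the leading columns maximize
    positive-class variance, the trailing columns maximize negative-class
    variance.
    """
    def avg_cov(trials):
        # Average trace-normalized spatial covariance over trials.
        return np.mean([X @ X.T / np.trace(X @ X.T) for X in trials], axis=0)

    S_pos = avg_cov(trials_pos)
    S_neg = avg_cov(trials_neg)
    # SciPy returns generalized eigenvalues in ascending order.
    _, vecs = eigh(S_pos, S_pos + S_neg)
    half = n_filters // 2
    # Filters from both ends of the eigenvalue spectrum are the most
    # discriminative: large eigenvalues favor the positive class, small
    # eigenvalues favor the negative class.
    return np.hstack([vecs[:, -half:], vecs[:, :n_filters - half]])
```

In the probabilistic framework of Wu et al. (2009), these same filters arise as the maximum likelihood estimate of the basis matrix shared across the two linear Gaussian class models.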
Similarly, Parameter eXpanded-VB (PX-VB) (Qi and Jaakkola 2007) expands a model with auxiliary parameters to reduce the coupling between variables in the original model, thereby accelerating the deterministic approximation of the target distribution in variational Bayesian inference.

In this study, we employ the parameter-expansion methods of (Liu, Rubin, and Wu 1998; Qi and Jaakkola 2007; Luttinen and Ilin 2010) to develop parameter-expanded algorithms for PCSP and VBCSP, leading to PCSP-PX and VBCSP-PX. By capitalizing on the convergence speed-up of parameter-expansion methods, we show that the expanded models, PCSP-PX and VBCSP-PX, converge to solutions faster than PCSP and VBCSP. In addition, we show that the generalization performance of PCSP-PX and VBCSP-PX is better than that of PCSP and VBCSP. In PCSP and VBCSP, feature vectors are constructed using only the variances of the expected latent variables, so the information on covariances is neglected. In contrast, the auxiliary parameters in PCSP-PX and VBCSP-PX reflect this covariance information as well.
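To make the variance-only feature construction just mentioned concrete, here is a small sketch of the standard log-variance feature recipe applied to spatially filtered trials. It is shown only to illustrate why cross-filter covariances are discarded in this construction; the function name and normalization are our own illustrative choices:

```python
import numpy as np

def log_variance_features(trials, W):
    """Project each trial onto the spatial filters W and keep only the
    per-filter log-variances, discarding cross-filter covariances.

    trials: array of shape (n_trials, n_channels, n_samples).
    W:      spatial filters of shape (n_channels, n_filters).
    Returns features of shape (n_trials, n_filters).
    """
    feats = []
    for X in trials:
        Z = W.T @ X                        # filtered signals, (n_filters, n_samples)
        v = Z.var(axis=1)                  # variance of each filtered signal
        feats.append(np.log(v / v.sum()))  # normalized log-variance features
    return np.asarray(feats)
```

Keeping only `v`, the diagonal of the projected covariance, is exactly the variance-only construction used by PCSP and VBCSP; the auxiliary parameters introduced in PCSP-PX and VBCSP-PX are what allow the neglected off-diagonal (covariance) information to influence the solution.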
Similar Resources
Probabilistic Models for Common Spatial Patterns: Parameter-Expanded EM and Variational Bayes
Common spatial patterns (CSP) is a popular feature extraction method for discriminating between positive and negative classes in electroencephalography (EEG) data. Two probabilistic models for CSP were recently developed: probabilistic CSP (PCSP), which is trained by expectation maximization (EM), and variational Bayesian CSP (VBCSP) which is learned by variational approximation. Parameter expa...
On the Slow Convergence of EM and VBEM in Low-Noise Linear Models
We analyze convergence of the expectation maximization (EM) and variational Bayes EM (VBEM) schemes for parameter estimation in noisy linear models. The analysis shows that both schemes are inefficient in the low-noise limit. The linear model with additive noise includes as special cases independent component analysis, probabilistic principal component analysis, factor analysis, and Kalman filt...
Variational Inference For Probabilistic Latent Tensor Factorization with KL Divergence
Probabilistic Latent Tensor Factorization (PLTF) is a recently proposed probabilistic framework for modelling multi-way data. Not only the common tensor factorization models but also any arbitrary tensor factorization structure can be realized by the PLTF framework. This paper presents full Bayesian inference via variational Bayes that facilitates more powerful modelling and allows more sophist...
Video Segmentation via Variational Bayes Mixture Models
Video modeling is of interest for many applications, including video indexing, video data compression, object detection, unusual-event detection and object recognition. Many approaches employ local (pixelbased) models. In Stauffer and Grimson’s work on background modeling [14], the intensity or color of each pixel is modeled as a mixture of Gaussians, where each mixing component represents one ...
Variational EM Algorithms for Non-Gaussian Latent Variable Models
We consider criteria for variational representations of non-Gaussian latent variables, and derive variational EM algorithms in general form. We establish a general equivalence among convex bounding methods, evidence based methods, and ensemble learning/Variational Bayes methods, which has previously been demonstrated only for particular cases.
Publication date: 2012